Abstract

We study the self-organization of the consonant inventories through a complex
network approach. We observe that the distributions of occurrence as well as
co-occurrence of the consonants across languages follow a power-law behavior.
The co-occurrence network of consonants exhibits a high clustering coefficient.
We propose four novel synthesis models for these networks (each of which is a
refinement of the previous one) so as to successively match with higher
accuracy (a) the above-mentioned topological properties as well as (b) the
linguistic property of feature economy exhibited by the consonant inventories.
We conclude by arguing that a possible interpretation of this mechanism of
network growth is the process of child language acquisition. Such models
essentially increase our understanding of the structure of languages that is
influenced by their evolutionary dynamics and this, in turn, can be extremely
useful for building future NLP applications.

1 Introduction 

A large number of regular patterns are observed across the sound inventories
of human languages. These regularities are arguably a consequence of the
self-organization that is instrumental in the emergence of these inventories
(de Boer, 2000). Many attempts have been made by functional phonologists to
explain this self-organizing behavior through certain general principles such
as maximal perceptual contrast (Liljencrants and Lindblom, 1972), ease of
articulation (Lindblom and Maddieson, 1988; de Boer, 2000), and ease of
learnability (de Boer, 2000). In fact, there are many studies that attempt to
explain the emergence of the vowel inventories through the application of one
or more of the above principles (Liljencrants and Lindblom, 1972; de Boer,
2000). Some studies have also been carried out in the area of linguistics that
seek to explain the observed patterns in the consonant inventories
(Trubetzkoy, 1939; Lindblom and Maddieson, 1988; Boersma, 1998; Clements,
2008). Nevertheless, most of these works are confined to certain individual
principles rather than formulating a general theory describing the emergence
of these regular patterns across the consonant inventories.

The self-organization of the consonant inventories emerges due to an
interaction of different forces acting upon them. In order to identify the
nature of these interactions one has to understand the growth dynamics of
these inventories. The theories of complex networks provide a number of growth
models that have proved to be extremely successful in explaining the
evolutionary dynamics of various social (Newman, 2001; Ramasco et al., 2004),
biological (Jeong et al., 2000) and other natural systems. The basic framework
for the current study develops around two such complex networks namely, the
Phoneme-Language Network or PlaNet (Choudhury et al., 2006) and its one-mode
projection, the Phoneme-Phoneme Network or PhoNet (Mukherjee et al., 2007a).
We begin by analyzing some of the structural properties (Sec. 2) of the
networks and observe that the consonant nodes in both PlaNet and PhoNet follow
a power-law-like degree distribution. Moreover, PhoNet is characterized by a
high clustering coefficient, a property that has been found to be prevalent in
many other social networks (Newman, 2001; Ramasco et al., 2004).



We propose four synthesis models for PlaNet (Sec. 3), each of which employs a
variant of a preferential attachment (Barabasi and Albert, 1999) based growth
kernel¹. While the first two models are independent of the characteristic
properties of the (consonant) nodes, the following two use them. These models
are successively refined not only to reproduce the topological properties of
PlaNet and PhoNet, but also to match the linguistic property of feature
economy (Boersma, 1998; Clements, 2008) that is observed across the consonant
inventories. The underlying growth rules for each of these individual models
help us to interpret the cause of the emergence of at least one (or more) of
the aforementioned properties. We conclude (Sec. 4) by providing a possible
interpretation of the proposed mathematical model that we finally develop in
terms of child language acquisition.

There are three major contributions of this work. 
Firstly, it provides a fascinating account of the struc- 
ture and the evolution of the human speech sound 
systems. Furthermore, the introduction of the node 
property based synthesis model is a significant con- 
tribution to the field of complex networks. On a 
broader perspective, this work shows how statisti- 
cal mechanics can be applied in understanding the 
structure of a linguistic system, which in turn can be 
extremely useful in developing future NLP applica- 
tions. 

2 Properties of the Consonant Inventories 

In this section, we briefly recapitulate the definitions 
of PlaNet and PhoNet, the data source, construction 
procedure for the networks and some of their impor- 
tant structural properties. We also revisit the con- 
cept of feature economy and the method used for its 
quantification. 




Figure 1: Illustration of the nodes and edges of PlaNet and PhoNet (panel
label: One-Mode Projection).



2.1 Structural Properties of the Consonant 
Networks 

PlaNet is a bipartite graph $G = (V_L, V_C, E_{pl})$ consisting of two sets of
nodes namely, $V_L$ (labeled by the languages) and $V_C$ (labeled by the
consonants); $E_{pl}$ is the set of edges running between $V_L$ and $V_C$.
There is an edge $e \in E_{pl}$ from a node $v_l \in V_L$ to a node
$v_c \in V_C$ iff the consonant $c$ is present in the inventory of language
$l$.

PhoNet is the one-mode projection of PlaNet onto the consonant nodes, i.e., a
network of consonants in which two nodes are linked by an edge whose weight is
the number of times they co-occur across languages. Hence, it can be
represented by a graph $G = (V_C, E_{ph})$, where $V_C$ is the set of
consonant nodes and $E_{ph}$ is the set of edges connecting these nodes in
$G$. There is an edge $e \in E_{ph}$ if the two nodes (read consonants) that
are connected by $e$ co-occur in at least one language and the number of
languages they co-occur in defines the weight of the edge $e$. Figure 1 shows
the nodes and the edges of PlaNet and PhoNet.
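The two definitions above can be illustrated with a minimal Python sketch. It assumes the inventories are available as a mapping from a language name to a set of consonant symbols; the function names and the toy data are hypothetical and are not drawn from UPSID.

```python
from collections import Counter
from itertools import combinations


def build_planet(inventories):
    """Return the PlaNet edge set {(language, consonant)}.

    inventories: dict mapping a language to the set of its consonants.
    """
    return {(lang, c) for lang, cons in inventories.items() for c in cons}


def build_phonet(inventories):
    """One-mode projection onto the consonant nodes.

    The weight of an edge is the number of languages in which the two
    consonants co-occur. Keys are consonant pairs in sorted order.
    """
    weights = Counter()
    for cons in inventories.values():
        for a, b in combinations(sorted(cons), 2):
            weights[(a, b)] += 1
    return weights


# Hypothetical toy inventories, for illustration only.
toy = {"L1": {"p", "t", "k"}, "L2": {"p", "t", "m"}, "L3": {"p", "m"}}
planet_edges = build_planet(toy)
phonet = build_phonet(toy)
```

Storing each PhoNet edge under a sorted pair keeps the projection undirected without having to record both orientations.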

Data Source and Network Construction: Like many other earlier studies
(Liljencrants and Lindblom, 1972; Lindblom and Maddieson, 1988; de Boer, 2000;
Hinskens and Weijer, 2003), we use the UCLA Phonological Segment Inventory
Database (UPSID) (Maddieson, 1984) as the source of our data. There are 317
languages in the database with a total of 541 consonants found across them.
Each consonant is characterized by a set of phonological features
(Trubetzkoy, 1931), which distinguishes it from others. UPSID uses
articulatory features to describe the consonants, which can be broadly
categorized into three different types namely the manner of articulation, the
place of articulation and phonation. Manner of articulation specifies how the
flow of air takes place in the vocal tract during articulation of a consonant,
whereas place of articulation specifies the active speech organ and also the
place where it acts. Phonation describes the vibration of the vocal cords
during the articulation of a consonant. Apart from these three major classes
there are also some secondary articulatory features found in certain
languages. There are around 52 features listed in UPSID; the important ones
are noted in Table 1. Note that in UPSID the features are assumed to be
binary-valued and therefore, each consonant can be represented by a binary
vector.

Table 1: The table shows some of the important features listed in UPSID. Over
99% of the UPSID languages have bilabial, dental-alveolar and velar plosives.
Furthermore, voiceless plosives outnumber the voiced ones (92% vs. 67%). 93%
of the languages have at least one fricative, 97% have at least one nasal and
96% have at least one liquid. Approximants occur in fewer than 95% of the
languages.

¹ The word kernel here refers to the function or mathematical formula that
drives the growth of the network.

We have used UPSID in order to construct PlaNet and PhoNet. Consequently,
$|V_L| = 317$ (in PlaNet) and $|V_C| = 541$. The numbers of edges in PlaNet
and PhoNet are 7022 and 30412 respectively.

Degree Distributions of PlaNet and PhoNet: The degree distribution is the
fraction of nodes, denoted by $P_k$, which have a degree² greater than or
equal to $k$ (Newman, 2003). The degree distributions of the consonant nodes
in PlaNet and PhoNet are shown in Figure 2 in the log-log scale. Both the
plots show a power-law behavior ($P_k \propto k^{-\alpha}$) with exponential
cut-offs towards the ends. The value of $\alpha$ is 0.71 for PlaNet and 0.89
for PhoNet.
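The cumulative degree distribution defined above can be computed as in the following sketch. The exponent estimate is a plain least-squares fit on the log-log points, which is an assumption made for illustration; the text does not state how the exponents were fitted, and such a fit ignores the exponential cut-offs. All names and the toy degree list are hypothetical.

```python
import math


def cumulative_degree_distribution(degrees):
    """Return sorted (k, P_k) pairs, where P_k is the fraction of nodes
    whose degree is greater than or equal to k."""
    n = len(degrees)
    return [(k, sum(1 for d in degrees if d >= k) / n)
            for k in sorted(set(degrees))]


def loglog_slope(points):
    """Estimate alpha in P_k ~ k^(-alpha) by least squares on log-log axes.

    Assumption: a simple linear fit over all points with k > 0 and P_k > 0.
    """
    xs = [math.log(k) for k, p in points if k > 0 and p > 0]
    ys = [math.log(p) for k, p in points if k > 0 and p > 0]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


# Hypothetical degree sequence, for illustration only.
toy_degrees = [1, 1, 2, 3, 3, 3, 8]
dd = cumulative_degree_distribution(toy_degrees)
```

For a weighted network such as PhoNet, the degree passed in would be the sum of edge weights incident on each node, as noted in footnote 2.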

Clustering Coefficient of PhoNet: The clustering coefficient for a node $i$ is
the number of links between the nodes that are the neighbors of $i$ divided by
the number of links that could possibly exist between them (Newman, 2003).
Since PhoNet is a weighted graph the above definition is suitably modified by
the one presented in (Barrat et al., 2004). According to this definition, the
clustering coefficient for a node $i$ is,

$$c_i = \frac{1}{\left(\sum_{\forall j} w_{ij}\right)(k_i - 1)} \sum_{\forall j,l} \frac{(w_{ij} + w_{il})}{2}\, a_{ij}\, a_{il}\, a_{jl} \tag{1}$$

where $j$ and $l$ are neighbors of $i$; $k_i$ represents the plain degree of
the node $i$; $w_{ij}$, $w_{jl}$ and $w_{il}$ denote the weights of the edges
connecting nodes $i$ and $j$, $j$ and $l$, and $i$ and $l$ respectively;
$a_{ij}$, $a_{il}$, $a_{jl}$ are boolean variables, which are true iff there
is an edge between the nodes $i$ and $j$, $i$ and $l$, and $j$ and $l$
respectively. The clustering coefficient of the network ($c_{av}$) is equal to
the average clustering coefficient of the nodes. The value of $c_{av}$ for
PhoNet is 0.89, which is significantly higher than that of a random graph with
the same number of nodes and edges (0.08).
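The weighted clustering coefficient of Barrat et al. (2004) can be sketched as follows. The sketch assumes a symmetric adjacency structure `adj` mapping each node to a dictionary of neighbor weights, and it assigns a coefficient of 0 to nodes with fewer than two neighbors; both are illustrative conventions, not details given in the text.

```python
def weighted_clustering(adj, i):
    """Weighted clustering coefficient of node i (Barrat et al., 2004).

    adj: dict mapping node -> {neighbor: weight}; must be symmetric.
    """
    nbrs = list(adj[i])
    k = len(nbrs)                       # plain degree k_i
    if k < 2:
        return 0.0                      # assumption: undefined case set to 0
    strength = sum(adj[i].values())     # sum of w_ij over neighbors j
    total = 0.0
    for j in nbrs:                      # ordered pairs (j, l) of neighbors
        for l in nbrs:
            if j != l and l in adj[j]:  # a_jl is true: the triangle closes
                total += (adj[i][j] + adj[i][l]) / 2.0
    return total / (strength * (k - 1))


def average_clustering(adj):
    """c_av: mean of the node-level coefficients."""
    return sum(weighted_clustering(adj, n) for n in adj) / len(adj)


# Hypothetical toy graph: a unit-weight triangle a-b-c plus a pendant edge a-d.
toy_adj = {
    "a": {"b": 1, "c": 1, "d": 2},
    "b": {"a": 1, "c": 1},
    "c": {"a": 1, "b": 1},
    "d": {"a": 2},
}
c_av = average_clustering(toy_adj)
```

Summing over ordered neighbor pairs counts each triangle twice, which is what makes the normalization by the strength times $(k_i - 1)$ yield a value of 1 for a fully connected neighborhood.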

2.2 Linguistic Properties: Feature Economy 
and its Quantification 

The principle of feature economy states that languages tend to use a small
number of distinctive features and maximize their combinatorial possibilities
to generate a large number of consonants (Boersma, 1998; Clements, 2008).
Stated differently, a given consonant will have a higher than expected chance
of occurrence in inventories in which all of its features have already
distinctively occurred in the other consonants. This principle immediately
implies that the consonants chosen by a language should share a considerable
number of features among them. The quantification process, which is a
refinement of the idea presented in (Mukherjee et al., 2007b), is as follows.

² For a weighted graph like PhoNet, the degree of a node i is the sum of
weights on the edges that are incident on i.

Figure 2: (a) Degree distribution (DD) of PlaNet along with that of PlaNet_syn
obtained from Model I and II respectively; (b) DD of PhoNet along with that of
PhoNet_syn obtained from Model I and II respectively. Both the plots are in
log-log scale.

Feature Entropy: For an inventory of size $N$, let there be $p_f$ consonants
for which a particular feature $f$ (recall that we assume $f$ to be
binary-valued) is present and $q_f$ other consonants for which the same is
absent. Therefore, the probability that a consonant (chosen uniformly at
random from this inventory) contains the feature $f$ is $\frac{p_f}{N}$ and
the probability that it does not contain the feature is $\frac{q_f}{N}$
($= 1 - \frac{p_f}{N}$). One can think of $f$ as an independent random
variable, which can take values 1 and 0, and $\frac{p_f}{N}$ and
$\frac{q_f}{N}$ define the probability distribution of $f$. Therefore, for any
given inventory, we can define the binary entropy $H_f$ (Shannon and Weaver,
1949) for the feature $f$ as

$$H_f = -\frac{p_f}{N}\log_2\frac{p_f}{N} - \frac{q_f}{N}\log_2\frac{q_f}{N} \tag{2}$$

If $F$ is the set of all features present in the consonants forming the
inventory, then feature entropy $F_E$ is the sum of the binary entropies with
respect to all the features, that is

$$F_E = \sum_{f \in F} H_f = \sum_{f \in F}\left(-\frac{p_f}{N}\log_2\frac{p_f}{N} - \frac{q_f}{N}\log_2\frac{q_f}{N}\right) \tag{3}$$

Since we have assumed that $f$ is an independent random variable, $F_E$ is the
joint entropy of the system. In other words, $F_E$ provides an estimate of the
number of discriminative features present in the consonants of an inventory
that a speaker (e.g., parent) has to communicate to a learner (e.g., child)
during language transmission. The lower the value of $F_E$ the higher is the
feature economy. The curve marked as (R) in Figure 3 shows the average feature
entropy of the consonant inventories of a particular size³ (y-axis) versus the
inventory size (x-axis).
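As a worked illustration of the feature entropy and of the per-size average described in footnote 3, the following sketch assumes each consonant is given as an equal-length tuple of 0/1 feature values. The function names and the toy inventory are hypothetical.

```python
import math
from collections import defaultdict


def binary_entropy(p):
    """Entropy (in bits) of a binary variable that is 1 with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0          # convention: 0 * log2(0) = 0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def feature_entropy(inventory):
    """Sum of the binary entropies of all features over one inventory.

    inventory: list of equal-length 0/1 feature vectors, one per consonant.
    Features absent from every consonant contribute 0 to the sum.
    """
    n = len(inventory)
    n_features = len(inventory[0])
    return sum(
        binary_entropy(sum(vec[f] for vec in inventory) / n)
        for f in range(n_features)
    )


def average_entropy_by_size(inventories):
    """Mean feature entropy over all inventories of the same size."""
    by_size = defaultdict(list)
    for inv in inventories:
        by_size[len(inv)].append(feature_entropy(inv))
    return {size: sum(v) / len(v) for size, v in by_size.items()}


# Hypothetical 4-consonant inventory over 3 features, for illustration only.
toy_inventory = [(1, 1, 1), (1, 1, 0), (0, 1, 0), (0, 1, 0)]
fe = feature_entropy(toy_inventory)
```

In the toy inventory the first feature splits the consonants evenly (1 bit), the second is shared by all of them (0 bits), and the third marks one consonant in four (about 0.81 bits), so a feature carried by every consonant costs nothing.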

3 Synthesis Models 

In this section, we describe four synthesis models 
that incrementally attempt to explain the emergence 
of the structural properties of PlaNet and PhoNet as 
well as the feature entropy exhibited by the consonant inventories. In all
these models, we assume that the distribution of the consonant inventory size,
i.e., the degrees of the language nodes in PlaNet, is known a priori.

3.1 Model I: Preferential Attachment Kernel 

This model employs a modified version of the kernel described in (Choudhury
et al., 2006), which is the only work in the literature that attempts to
explain the emergence of the consonant inventories in the framework of
complex networks.

Let us assume that a language node $L_j \in V_L$ has a degree $k_j$. The
consonant nodes in $V_C$ are assumed to be unlabeled, i.e., they are not
marked by the distinctive features that characterize them. We first sort the
nodes $L_1$ through $L_{317}$ in the ascending order of their degrees. At each
time step a node $L_j$, chosen in order, preferentially attaches itself with
$k_j$ distinct nodes (call each such node $C_i$) of the set $V_C$. The
probability $Pr(C_i)$ with which the node $L_j$ attaches itself to the node
$C_i$ is given by,

$$Pr(C_i) = \frac{d_i^{\alpha} + \epsilon}{\sum_{x \in V_C'} \left(d_x^{\alpha} + \epsilon\right)} \tag{4}$$

where $d_i$ is the current degree of the node $C_i$, $V_C'$ is the set of
nodes in $V_C$ that are not already connected to $L_j$, $\epsilon$ is the
smoothing parameter that facilitates random attachments and $\alpha$ indicates
whether the attachment kernel is sub-linear ($\alpha < 1$), linear
($\alpha = 1$) or super-linear ($\alpha > 1$). Note that the modification from
the earlier kernel (Choudhury et al., 2006) is brought about by the
introduction of $\alpha$. The above process is repeated until all the language
nodes $L_j \in V_L$ get connected to $k_j$ consonant nodes (refer to Figure 6
of (Choudhury et al., 2006) for an illustration of the steps of the synthesis
process). Thus, we have the synthesized version of PlaNet, which we shall call
PlaNet_syn henceforth.

The Simulation Results: We simulate the above model to obtain PlaNet_syn for
100 different runs and average the results over all of them. We find that the
degree distributions that emerge fit the empirical data well for
$\alpha \in [1.4, 1.5]$ and $\epsilon \in [0.4, 0.6]$, the best being at
$\alpha = 1.44$ and $\epsilon = 0.5$ (shown in Figure 2). In fact, the mean
error⁴ between the real and the synthesized distributions for the best choice
of parameters is as small as 0.01. Note that this error in case of the model
presented in (Choudhury et al., 2006) was 0.03. Furthermore, as we shall see
shortly, a super-linear kernel can explain various other topological
properties more accurately than a linear kernel.

In absence of preferential attachment, i.e., when all the connections to the
consonant nodes are equiprobable, the mean error rises to 0.35.

A possible reason behind the success of this model is the fact that language
is a constantly changing system and preferential attachment plays a
significant role in this change. For instance, during the change those
consonants that belong to languages that are more prevalent among the speakers
of a generation have higher chances of being transmitted to the speakers of
the subsequent generations (Blevins, 2004). This heterogeneity in the choice
of the consonants manifests itself as preferential attachment. We conjecture
that the value of $\alpha$ is a function of the societal structure and the
cognitive capabilities of human beings. The exact nature of this function is
currently not known and a topic for future research. The parameter $\epsilon$
in this case may be thought of as modeling the randomness of the system.

Nevertheless, the degree distribution of PhoNet_syn, which is the one-mode
projection of PlaNet_syn, does not match the real data well (see Figure 2).
The mean error between the two distributions is 0.45. Furthermore, the
clustering coefficient of PhoNet_syn is 0.55 and differs largely from that of
PhoNet. The primary reason for this deviation in the results is that PhoNet
exhibits strong patterns of co-occurrences (Mukherjee et al., 2007a) and this
fact is not taken into account by Model I. In order to circumvent the above
problem, we introduce the concept of triad (i.e., fully connected triplet)
formation and thereby refine the model in the following section.

³ Let there be $n$ inventories of a particular size $k$. The average feature
entropy of the inventories of size $k$ is $\frac{1}{n}\sum_{i=1}^{n} F_{E_i}$,
where $F_{E_i}$ signifies the feature entropy of the $i$-th inventory of size
$k$.

⁴ Mean error is defined as the average difference between the ordinate pairs
(say $y$ and $y'$) where the abscissas are equal. In other words, if there are
$N$ such ordinate pairs then the mean error can be expressed as
$\frac{\sum |y - y'|}{N}$.

3.2 Model II: Kernel based on Triad Formation


The triad model (Peltomäki and Alava, 2006) builds upon the concept of
neighborhood formation. Two consonant nodes $C_1$ and $C_2$ become neighbors
if a language node at any step of the synthesis process attaches itself to
both $C_1$ and $C_2$. Let the probability of triad formation be denoted by
$p_t$. At each time step a language node $L_j$ (chosen from the set of
language nodes sorted in ascending order of their degrees) makes the first
connection preferentially to a consonant node $C_i \in V_C$ to which $L_j$ is
not already connected following the distribution $Pr(C_i)$. For the rest of
the $(k_j - 1)$ connections $L_j$ attaches itself preferentially to only the
neighbors of $C_i$ to which $L_j$ is not yet connected with a probability
$p_t$. Consequently, $L_j$ connects itself preferentially to the non-neighbors
of $C_i$ to which $L_j$ is not yet connected with a probability $(1 - p_t)$.
The neighbor set of $C_i$ gets updated accordingly. Note that every time the
node $C_i$ and its neighbors are chosen they together impose a clique on the
one-mode projection. This phenomenon leads to the formation of a large number
of triangles in the one-mode projection thereby increasing the clustering
coefficient of the resultant network.
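The growth rules of Model I and of the triad-formation step described above can be sketched together in Python. This is a minimal illustration under stated assumptions rather than the simulation code behind the reported results: consonant nodes are integers, `eps` is assumed positive, every inventory size is assumed not to exceed the number of consonant nodes, and when the pool selected by the $p_t$ draw is empty the sketch falls back to all unconnected nodes (a choice the text does not specify). The function and parameter names are hypothetical.

```python
import random


def pick(degree, candidates, alpha, eps, rng):
    """Draw one candidate with probability proportional to d^alpha + eps."""
    weights = [degree[c] ** alpha + eps for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


def synthesize(lang_degrees, n_consonants, alpha, eps, p_t=None, seed=0):
    """Grow a synthetic PlaNet.

    p_t=None applies only the preferential kernel (Model I); otherwise the
    triad-formation rule (Model II) is used after the first connection.
    Returns the list of inventories (sets of node ids) and the node degrees.
    """
    rng = random.Random(seed)
    degree = [0] * n_consonants
    neighbors = [set() for _ in range(n_consonants)]
    inventories = []
    for k in sorted(lang_degrees):          # languages in ascending degree
        chosen, first = set(), None
        for _ in range(k):
            free = [c for c in range(n_consonants) if c not in chosen]
            pool = free
            if p_t is not None and first is not None:
                near = [c for c in free if c in neighbors[first]]
                far = [c for c in free if c not in neighbors[first]]
                pool = near if rng.random() < p_t else far
                if not pool:                # assumption: fall back if empty
                    pool = free
            c = pick(degree, pool, alpha, eps, rng)
            if first is None:
                first = c
            chosen.add(c)
        # Updating after the language is complete is equivalent here, since
        # nodes already chosen by this language are never candidates again.
        for c in chosen:
            degree[c] += 1
            neighbors[c] |= chosen - {c}
        inventories.append(chosen)
    return inventories, degree


inv_m1, deg_m1 = synthesize([2, 3], 6, alpha=1.44, eps=0.5)
inv_m2, deg_m2 = synthesize([2, 3, 3, 5], 10, alpha=1.3, eps=0.3, p_t=0.8)
```

Passing a different `seed` for each run and averaging the resulting degree distributions would mirror the repeated-run protocol described in the simulation paragraphs.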

The Simulation Results: We carry out 100 different simulation runs of the
above model for a particular set of parameter values to obtain PlaNet_syn and
average the results over all of them. We explore several parameter settings in
the range as follows: $\alpha \in [1, 1.5]$ (in steps of 0.1),
$\epsilon \in [0.2, 0.4]$ (in steps of 0.1) and $p_t \in [0.70, 0.95]$ (in
steps of 0.05). We also observe that if we traverse any further along one or
more of the dimensions of the parameter space then the results get worse. The
best result emerges for $\alpha = 1.3$, $\epsilon = 0.3$ and $p_t = 0.8$.

Figure 2 shows the degree distribution of the consonant nodes of PlaNet_syn
and PlaNet. The mean error between the two distributions is approximately 0.04
and is therefore worse than the result obtained from Model I. Nevertheless,
the average clustering coefficient of PhoNet_syn in this case is 0.85, which
is within 4.5% of that of PhoNet. Moreover, in this process the mean error
between the degree distribution of PhoNet_syn and PhoNet (as illustrated in
Figure 2) has reduced drastically from 0.45 to 0.03.
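The mean error used in these comparisons (footnote 4) can be sketched as follows, assuming each distribution is stored as a mapping from abscissa to ordinate and that only abscissas present in both curves are compared. The function name and the toy values are hypothetical.

```python
def mean_error(real, synth):
    """Average absolute difference between ordinates at shared abscissas.

    real, synth: dicts mapping an abscissa (e.g. degree k) to an ordinate
    (e.g. P_k). Abscissas missing from either curve are ignored.
    """
    shared = sorted(set(real) & set(synth))
    if not shared:
        raise ValueError("the two curves share no abscissa")
    return sum(abs(real[x] - synth[x]) for x in shared) / len(shared)


# Hypothetical curves, for illustration only.
err = mean_error({1: 1.0, 2: 0.5, 3: 0.2}, {1: 0.9, 2: 0.6, 4: 0.1})
```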

One can again find a possible association of this model with the phenomena of
language change. If a group of consonants largely co-occur in the languages of
a generation of speakers then it is very likely that all of them get
transmitted together in the subsequent generations (Blevins, 2004). The triad
formation probability ensures that if a pair of consonant nodes become
neighbors of each other in a particular step of the synthesis process then the
choice of such a pair should be highly preferred in the subsequent steps of
the process. This is coherent with the aforementioned phenomenon of
transmission of consonants in groups over linguistic generations. Since the
value of $p_t$ that we obtain is quite high, it may be argued that such
transmissions are largely prevalent in nature.

Figure 3: Average feature entropy of the inventories of a particular size
(y-axis) versus the inventory size (x-axis). Legend: R: real inventories; M3:
inventories obtained from Model III; M4: inventories obtained from Model IV
(seed of 30 languages).

Although Model II reproduces the structural properties of PlaNet and PhoNet
quite accurately, as we shall see shortly, it fails to generate inventories
that closely match the real ones in terms of feature entropy. However, at this
point, recall that Model II assumes that the consonant nodes are unlabeled;
therefore, the inventories that are produced as a result of the synthesis are
composed of consonants, which unlike the real inventories, are not marked by
their distinctive features. In order to label them we perform the following,

The Labeling Scheme:

1. Sort the consonants of UPSID in the decreasing order of their frequency of
occurrence and call this list of consonants ListC[1 ... 541],

2. Sort the $V_C$ nodes of PlaNet_syn in decreasing order of their degree and
call this list of nodes ListN[1 ... 541],

3. $\forall_{1 \le i \le 541}$ ListN[i] ← ListC[i]
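The three labeling steps above amount to a rank-matching assignment, which can be sketched as follows. The sketch assumes the real frequencies and the synthesized degrees are given as dictionaries; ties are broken by insertion order, which is an assumption since the text does not say how ties are handled. All names and values are hypothetical.

```python
def label_nodes(real_frequency, synth_degree):
    """Assign real consonant labels to synthesized nodes by matching ranks.

    real_frequency: consonant -> number of languages containing it.
    synth_degree:   synthesized node id -> degree in the synthetic network.
    Returns a dict mapping node id -> consonant.
    """
    # Step 1: consonants in decreasing order of frequency (ListC).
    list_c = sorted(real_frequency, key=lambda c: -real_frequency[c])
    # Step 2: synthesized nodes in decreasing order of degree (ListN).
    list_n = sorted(synth_degree, key=lambda n: -synth_degree[n])
    # Step 3: the i-th node receives the i-th consonant.
    return dict(zip(list_n, list_c))


# Hypothetical frequencies and degrees, for illustration only.
labels = label_nodes({"p": 300, "m": 290, "q": 40}, {0: 5, 1: 200, 2: 150})
```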

Figure 3 indicates that the curve for the real inventories (R) and that
obtained from Model II (M2) are significantly different from each other. This
difference arises due to the fact that in Model II, the choice of a consonant
from the set of neighbors is solely degree-dependent, where the relationships
between the features are not taken into consideration. Therefore, in order to
eliminate this problem, we introduce the model using the feature-based kernel
in the next section.

3.3 Model III: Feature-based Kernel 

In this model, we assume that each of the consonant nodes is labeled, that is,
each of them is marked by a set of distinctive features. The attachment kernel
in this case has two components, one of which is preferential while the other
favors the choice of those consonants that are at a low feature distance (the
number of feature positions they differ at) from the already chosen ones. Let
us denote the feature distance between two consonants $C_i$ and $C_i'$ by
$D(C_i, C_i')$. We define the affinity, $A(C_i, C_i')$, between $C_i$ and
$C_i'$ as

$$A(C_i, C_i') = \frac{1}{D(C_i, C_i')} \tag{5}$$

Therefore, the lower the feature distance between $C_i$ and $C_i'$ the higher
is the affinity between them.

At each time step a language node establishes the first connection with a
consonant node (say $C_i$) preferentially following the distribution
$Pr(C_i)$ like the previous models. The rest of the connections to any
arbitrary consonant node $C_i'$ (not yet connected to the language node) are
made following the distribution $(1 - w)Pr(C_i') + w Pr_{aff}(C_i, C_i')$,
where

$$Pr_{aff}(C_i, C_i') = \frac{A(C_i, C_i')}{\sum_{\forall C_i'} A(C_i, C_i')} \tag{6}$$

and $0 < w < 1$.
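The feature-based kernel can be illustrated with the sketch below. It assumes that consonants are labeled with equal-length 0/1 feature tuples, that the affinity is the reciprocal of the feature distance, and that the affinity term is normalized over the candidates not yet connected to the language node. The function names, parameter names and toy features are hypothetical.

```python
def feature_distance(u, v):
    """Number of feature positions at which two binary vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def affinity(u, v):
    """Reciprocal of the feature distance; distinct consonants have D >= 1."""
    return 1.0 / feature_distance(u, v)


def model3_distribution(first, candidates, features, degree, alpha, eps, w):
    """Mixture (1 - w) * preferential + w * affinity over the candidates.

    first:      the consonant chosen in the first connection.
    candidates: consonants not yet connected to the language node.
    features:   consonant -> binary feature tuple.
    degree:     consonant -> current degree.
    """
    pref = [degree[c] ** alpha + eps for c in candidates]
    aff = [affinity(features[first], features[c]) for c in candidates]
    pref_total, aff_total = sum(pref), sum(aff)
    return [(1 - w) * p / pref_total + w * a / aff_total
            for p, a in zip(pref, aff)]


# Hypothetical features and degrees, for illustration only.
toy_features = {"p": (1, 0, 0), "b": (1, 0, 1), "m": (0, 1, 1)}
toy_degree = {"p": 3, "b": 2, "m": 1}
probs = model3_distribution("p", ["b", "m"], toy_features, toy_degree,
                            alpha=1.0, eps=0.0, w=0.5)
```

In the toy case "b" differs from "p" in one feature and "m" in three, so the affinity term shifts probability towards "b" beyond what its degree alone would give.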

Simulation Results: We perform 100 different simulation runs of the above
model for a particular set of parameter values to obtain PlaNet_syn and
average the results over all of them. We explore different parameter settings
in the range as follows: $\alpha \in [1, 2]$ (in steps of 0.1),
$\epsilon \in [0.1, 1]$ (in steps of 0.1) and $w \in [0.1, 0.5]$ (in steps of
0.05). The best result in terms of the structural properties of PlaNet and
PhoNet emerges for $\alpha = 1.6$, $\epsilon = 0.3$ and $w = 0.2$.

In this case, the mean error between the degree distribution curves for
PlaNet_syn and PlaNet is 0.05 and that between PhoNet_syn and PhoNet is 0.02.
Furthermore, the clustering coefficient of PhoNet_syn in this case is 0.84,
which is within 5.6% of that of PhoNet. The above results show that the
structural properties of the synthesized networks in this case are quite
similar to those obtained through the triad model. Nevertheless, the average
feature entropy of the inventories produced (see curve M3 in Figure 3) is
closer to that of the real ones now (for quantitative comparison see Table 2).

Therefore, it turns out that the groups of consonants that largely co-occur in
the languages of a linguistic generation are actually driven by the principle
of feature economy (see (Clements, 2008; Mukherjee et al., 2007a) for
details).



Table 2: Important results obtained from each of the mod- 
els. ME: Mean Error, DD: Degree Distribution. 



However, note that even for Model III the nodes that are chosen for attachment
in the initial stages of the synthesis process are arbitrary and consequently,
the labels of the nodes of PlaNet_syn do not have a one-to-one correspondence
with those of PlaNet, which is the main reason behind the difference in the
results between them. In order to overcome this problem we can make use of a
small set of real inventories to bootstrap the model.

3.4 Model IV: Feature-based Kernel and 
Bootstrapping 

In order to create a bias towards the labeling scheme 
prevalent in PlaNet, we use 30 (around 10% of 
the) real languages as a seed (chosen randomly) for 
Model III; i.e., they are used by the model for boot- 
strapping. The idea is summarized below. 

1. Select 30 real inventories at random and construct a PlaNet from them. Call
this network the initial PlaNet_syn.

2. The rest of the language nodes are incrementally added to this initial
PlaNet_syn using Model III.

Simulation Results: The best fit now emerges at $\alpha = 1.35$,
$\epsilon = 0.3$ and $w = 0.15$. The mean error between the degree
distribution of PlaNet and PlaNet_syn is 0.05 and that between PhoNet and
PhoNet_syn is 0.02. The clustering coefficient of PhoNet_syn is 0.83 in this
case (within 6.7% of that of PhoNet).

The inventories that are produced as a result of the 
bootstrapping have an average feature entropy closer 
to the real inventories (see curve M4 in Figure [3]) 
than the earlier models. Hence, we find that this im- 
proved labeling strategy brings about a global better- 
ment in our results unlike in the previous cases. The 
larger the number of languages used for the purpose 
of bootstrapping the better are the results mainly in 
terms of the match in the feature entropy curves. 



4 Conclusion 

We dedicated the preceding sections of this article to analyzing and
synthesizing the consonant inventories of the world's languages in the
framework of a complex network. Table 2 summarizes the results obtained from
the four models so that the reader can easily compare them. Some of our
important observations are

• The distributions of occurrence and co-occurrence of consonants across
languages roughly follow a power law,

• The co-occurrence network of consonants has a 
large clustering coefficient, 

• Groups of consonants that largely co-occur across 
languages are driven by feature economy (which can 
be expressed through feature entropy), 

• Each of the above properties emerges due to dif- 
ferent reasons, which are successively unfurled by 
our models. 

So far, we have tried to explain the physical sig- 
nificance of our models in terms of the process of 
language change. Language change is a collective 
phenomenon that functions at the level of a popu- 
lation of speakers (Steels, 2000). Nevertheless, it 
is also possible to explain the significance of the 
models at the level of an individual, primarily in 
terms of the process of language acquisition, which 
largely governs the course of language change. In 
the initial years of language development every child passes through a stage
called babbling during which he/she learns to produce non-meaningful sequences
of consonants and vowels, some of which are not even used in the language to
which they are exposed (Jakobson, 1968; Locke, 1983). Clear preferences can be
observed for learning certain sounds such as plosives and nasals, whereas
fricatives and liquids are avoided. In fact, this hierarchy of preference
during the babbling stage follows the cross-linguistic frequency distribution
of the consonants. This innate frequency dependent preference towards certain
phonemes might be because of phonetic reasons (i.e., for
articulatory/perceptual benefits). In all our models, this innate preference
gets captured through the process of preferential attachment.
However, at the same time, in the context of learning 
a particular inventory the ease of learning the indi- 
vidual consonants also plays an important role. The 



lower the number of new feature distinctions to be 
learnt, the higher the ease of learning the consonant. 
Therefore, there are two orthogonal preferences: (a) 
the occurrence frequency dependent preference (that 
is innate), and (b) the feature-dependent preference 
(that increases the ease of learning), which are in- 
strumental in the acquisition of the inventories. The 
feature-based kernel is essentially a linear combina- 
tion of these two mutually orthogonal factors. 